Benchmark prs #1529

Draft
jprendes wants to merge 2 commits into hyperlight-dev:main from jprendes:benchmark-prs

@jprendes

Contributor

No description provided.

jprendes force-pushed the benchmark-prs branch 2 times, most recently from 8ad9393 to 0e181c4 (June 12, 2026 10:43)
jprendes added the kind/enhancement label (For PRs adding features, improving functionality, docs, tests, etc.) Jun 12, 2026
jprendes force-pushed the benchmark-prs branch 4 times, most recently from bc499ea to a277ac1 (June 17, 2026 14:48)
jprendes force-pushed the benchmark-prs branch 3 times, most recently from e660953 to 9ce9ed0 (June 18, 2026 11:22)
@hyperlight-gh-bot

Benchmark Results

kvm / amd (Linux) (❌ *1.81x slower* → 🚀 **2.41x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
7.91 ms (❌ 1.81x slower) 10.23 ms (❌ 1.19x slower)

guest_calls

call_with_host_function call_with_restore call different_thread interrupt_latency
medium 23.92 µs (✅ 1.07x slower) 70.81 µs (✅ 1.13x faster) 12.77 µs (✅ 1.02x slower)
small 20.73 µs (✅ 1.11x faster) 65.39 µs (✅ 1.08x faster) 11.50 µs (✅ 1.08x faster)
default 23.58 µs (✅ 1.02x slower) 62.36 µs (✅ 1.00x slower) 12.91 µs (✅ 1.02x slower)
9.82 ms (❌ 1.48x slower) 22.93 µs (❌ 1.13x slower)
large 24.52 µs (✅ 1.07x slower) 89.47 µs (✅ 1.17x faster) 12.97 µs (✅ 1.02x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
596.90 ms (✅ 1.07x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
22.33 µs (✅ 1.68x faster) 21.02 µs (🚀 1.86x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 28.23 ms (🚀 2.31x faster) 27.82 ms (🚀 2.41x faster) 29.26 ms (🚀 2.25x faster) 39.64 ms (🚀 1.98x faster)
medium 8.62 ms (🚀 2.05x faster) 9.84 ms (🚀 1.97x faster) 9.12 ms (🚀 1.97x faster) 24.22 ms (✅ 1.27x faster)
small 2.18 ms (✅ 1.73x faster) 5.11 ms (✅ 1.03x slower) 2.43 ms (✅ 1.47x faster) 18.61 ms (❌ 1.19x slower)
default 403.76 µs (✅ 1.27x faster) 2.59 ms (❌ 1.43x slower) 449.76 µs (✅ 1.25x faster) 18.50 ms (❌ 1.59x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 4.72 ms (✅ 1.11x slower) 3.03 ms (❌ 1.20x slower) 4.48 ms (✅ 1.10x slower)
1MB 40.44 µs (✅ 1.02x slower) 28.08 µs (✅ 1.30x faster) 38.55 µs (✅ 1.07x slower)

snapshots

restore create
medium 15.56 µs (✅ 1.03x faster) 28.75 ms (✅ 1.21x faster)
large 36.68 µs (✅ 1.41x faster) 100.35 ms (✅ 1.40x faster)
small 10.73 µs (❌ 1.13x slower) 2.93 ms (✅ 1.20x faster)
default 9.83 µs (✅ 1.03x faster) 302.77 µs (✅ 1.28x faster)

Summary

  • Biggest gain: sandboxes/create_initialized/large — 🚀 2.41x faster
  • Worst regression: function_call_serialization/serialize_function_call — ❌ 1.81x slower

kvm / intel (Linux) (❌ *1.88x slower* → 🚀 **7.91x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
7.01 ms (❌ 1.23x slower) 10.82 ms (✅ 1.15x faster)

guest_calls

call_with_host_function call_with_restore call different_thread interrupt_latency
medium 39.32 µs (✅ 1.04x slower) 43.21 µs (✅ 1.03x slower) 20.59 µs (✅ 1.07x slower)
small 39.81 µs (✅ 1.07x slower) 37.25 µs (✅ 1.01x faster) 20.58 µs (✅ 1.10x slower)
default 36.34 µs (✅ 1.03x faster) 36.97 µs (✅ 1.02x faster) 21.09 µs (✅ 1.09x slower)
10.90 ms (❌ 1.64x slower) 14.60 µs (🚀 1.85x faster)
large 37.04 µs (✅ 1.05x slower) 80.05 µs (✅ 1.04x slower) 19.39 µs (✅ 1.00x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
653.71 ms (✅ 1.04x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
34.89 µs (🚀 3.07x faster) 34.42 µs (🚀 3.24x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 66.06 ms (🚀 2.32x faster) 70.78 ms (🚀 2.20x faster) 68.71 ms (🚀 2.24x faster) 80.86 ms (🚀 2.08x faster)
medium 18.34 ms (🚀 2.21x faster) 20.73 ms (🚀 2.07x faster) 18.33 ms (🚀 2.20x faster) 36.35 ms (✅ 1.51x faster)
small 2.30 ms (✅ 1.55x faster) 4.28 ms (✅ 1.29x faster) 2.39 ms (✅ 1.53x faster) 21.46 ms (❌ 1.24x slower)
default 408.44 µs (✅ 1.20x faster) 2.19 ms (✅ 1.05x slower) 416.68 µs (✅ 1.29x faster) 21.34 ms (❌ 1.88x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 6.08 ms (🚀 1.85x faster) 3.03 ms (✅ 1.02x faster) 4.73 ms (✅ 1.17x faster)
1MB 49.10 µs (✅ 1.04x slower) 29.03 µs (✅ 1.25x faster) 44.40 µs (✅ 1.11x slower)

snapshots

restore create
medium 15.95 µs (✅ 1.04x faster) 44.43 ms (✅ 1.68x faster)
large 67.05 µs (🚀 7.91x faster) 173.59 ms (✅ 1.70x faster)
small 11.79 µs (✅ 1.03x slower) 2.71 ms (✅ 1.46x faster)
default 11.17 µs (✅ 1.00x slower) 301.70 µs (✅ 1.27x faster)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 7.91x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ❌ 1.88x slower

mshv3 / amd (Linux) (❌ *2.19x slower* → 🚀 **2.85x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
12.21 ms (❌ 1.77x slower) 8.42 ms (❌ 1.59x slower)

guest_calls

call_with_host_function call_with_restore call different_thread interrupt_latency
default 67.83 µs (✅ 1.35x faster) 154.02 µs (✅ 1.07x faster) 39.15 µs (✅ 1.35x faster)
small 69.99 µs (✅ 1.30x faster) 154.02 µs (✅ 1.02x slower) 41.88 µs (✅ 1.33x faster)
large 61.27 µs (✅ 1.49x faster) 213.11 µs (✅ 1.09x faster) 39.94 µs (✅ 1.39x faster)
54.00 µs (✅ 1.42x faster) 45.40 µs (✅ 1.09x faster)
medium 69.47 µs (✅ 1.23x faster) 173.44 µs (✅ 1.04x slower) 43.01 µs (✅ 1.32x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
86.23 ms (❌ 1.15x slower)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
63.91 µs (✅ 1.33x faster) 60.65 µs (✅ 1.33x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
medium 94.72 ms (❌ 1.52x slower) 10.57 ms (🚀 2.46x faster) 10.34 ms (🚀 2.49x faster) 12.61 ms (🚀 2.25x faster)
small 91.72 ms (❌ 2.19x slower) 2.38 ms (✅ 1.71x faster) 2.19 ms (✅ 1.77x faster) 3.90 ms (✅ 1.49x faster)
large 184.48 ms (❌ 1.28x slower) 33.93 ms (🚀 2.85x faster) 33.59 ms (🚀 2.81x faster) 39.90 ms (🚀 2.58x faster)
default 48.76 ms (❌ 1.41x slower) 516.20 µs (✅ 1.26x faster) 463.74 µs (✅ 1.32x faster) 1.40 ms (✅ 1.09x faster)

shared_memory

copy_from_slice copy_to_slice fill
64MB 5.30 ms (✅ 1.06x faster) 5.18 ms (✅ 1.06x slower) 4.11 ms (✅ 1.10x slower)
1MB 42.62 µs (✅ 1.03x slower) 42.48 µs (✅ 1.01x faster) 41.49 µs (✅ 1.00x slower)

snapshots

restore create
default 82.36 µs (✅ 1.02x faster) 358.92 µs (✅ 1.42x faster)
large 1.80 ms (🚀 1.86x faster) 196.62 ms (✅ 1.22x faster)
medium 94.28 µs (✅ 1.30x faster) 47.91 ms (✅ 1.20x faster)
small 85.34 µs (✅ 1.03x faster) 2.78 ms (✅ 1.59x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized_and_drop/large — 🚀 2.85x faster
  • Worst regression: sandboxes/create_initialized_and_drop/small — ❌ 2.19x slower

mshv3 / intel (Linux) (❌ *1.56x slower* → 🚀 **1.89x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
14.56 ms (✅ 1.05x slower) 11.73 ms (❌ 1.12x slower)

guest_calls

different_thread call_with_restore interrupt_latency call call_with_host_function
87.77 µs (✅ 1.43x faster) 70.06 µs (✅ 1.23x faster)
medium 264.19 µs (✅ 1.07x faster) 78.33 µs (✅ 1.31x faster) 120.92 µs (✅ 1.39x faster)
default 245.26 µs (✅ 1.13x faster) 79.13 µs (✅ 1.33x faster) 121.14 µs (✅ 1.35x faster)
small 251.80 µs (✅ 1.09x faster) 79.76 µs (✅ 1.34x faster) 124.74 µs (✅ 1.38x faster)
large 348.65 µs (✅ 1.18x faster) 80.72 µs (✅ 1.28x faster) 115.04 µs (✅ 1.52x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
113.56 ms (✅ 1.13x faster)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
98.22 µs (✅ 1.47x faster) 93.62 µs (✅ 1.49x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
large 230.76 ms (❌ 1.49x slower) 63.71 ms (✅ 1.66x faster) 63.13 ms (✅ 1.69x faster) 70.62 ms (✅ 1.59x faster)
small 59.00 ms (❌ 1.40x slower) 2.56 ms (✅ 1.76x faster) 2.31 ms (🚀 1.89x faster) 5.07 ms (✅ 1.28x faster)
medium 80.48 ms (❌ 1.27x slower) 17.28 ms (✅ 1.63x faster) 17.35 ms (✅ 1.63x faster) 20.46 ms (✅ 1.54x faster)
default 51.96 ms (❌ 1.56x slower) 386.46 µs (✅ 1.32x faster) 355.91 µs (✅ 1.38x faster) 1.52 ms (✅ 1.10x faster)

shared_memory

copy_to_slice copy_from_slice fill
1MB 41.58 µs (✅ 1.07x slower) 70.21 µs (❌ 1.12x slower) 34.91 µs (✅ 1.08x faster)
64MB 8.69 ms (✅ 1.09x faster) 11.42 ms (✅ 1.01x slower) 7.51 ms (✅ 1.03x slower)

snapshots

create restore
medium 42.42 ms (✅ 1.29x faster) 182.54 µs (❌ 1.14x slower)
default 358.88 µs (✅ 1.27x faster) 160.07 µs (✅ 1.02x slower)
small 4.11 ms (✅ 1.45x faster) 157.98 µs (✅ 1.03x slower)
large 165.26 ms (✅ 1.27x faster) 1.80 ms (✅ 1.45x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized/small — 🚀 1.89x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ❌ 1.56x slower

hyperv-ws2025 / amd (Windows) (❌ *4.96x slower* → ✅ **1.42x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
11.54 ms (❌ 1.18x slower) 12.55 ms (❌ 1.46x slower)

guest_calls

call call_with_host_function call_with_restore different_thread interrupt_latency
default 54.08 µs (❌ 1.42x slower) 92.17 µs (❌ 1.37x slower) 208.67 µs (❌ 1.64x slower)
large 59.01 µs (❌ 1.46x slower) 120.51 µs (❌ 1.78x slower) 445.99 µs (❌ 1.79x slower)
medium 54.22 µs (❌ 1.42x slower) 107.02 µs (❌ 1.60x slower) 276.33 µs (❌ 1.68x slower)
small 53.11 µs (❌ 1.39x slower) 108.09 µs (❌ 1.71x slower) 206.86 µs (❌ 1.39x slower)
84.65 µs (✅ 1.10x slower) 148.91 µs (❌ 4.96x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
4.92 s (❌ 1.49x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
94.42 µs (❌ 1.30x slower) 90.34 µs (❌ 1.35x slower)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 7.06 ms (❌ 1.41x slower) 6.89 ms (❌ 1.62x slower) 1.68 ms (❌ 1.37x slower) 1.90 ms (❌ 1.29x slower)
large 288.00 ms (✅ 1.17x faster) 239.45 ms (✅ 1.25x faster) 393.73 ms (✅ 1.05x slower) 315.94 ms (✅ 1.10x faster)
medium 96.81 ms (❌ 1.14x slower) 88.33 ms (❌ 1.17x slower) 78.66 ms (✅ 1.21x faster) 61.84 ms (✅ 1.42x faster)
small 19.84 ms (❌ 1.28x slower) 13.62 ms (✅ 1.01x slower) 10.73 ms (✅ 1.18x faster) 8.73 ms (✅ 1.37x faster)

shared_memory

copy_from_slice copy_to_slice fill
1MB 52.17 µs (❌ 1.22x slower) 53.45 µs (❌ 1.14x slower) 43.71 µs (✅ 1.05x slower)
64MB 10.77 ms (❌ 1.29x slower) 13.19 ms (❌ 1.40x slower) 6.89 ms (❌ 1.11x slower)

snapshots

create restore
default 975.92 µs (❌ 1.34x slower) 126.94 µs (❌ 1.68x slower)
large 423.02 ms (❌ 1.12x slower) 65.22 ms (❌ 1.67x slower)
medium 89.67 ms (✅ 1.00x slower) 1.12 ms (❌ 4.23x slower)
small 11.11 ms (✅ 1.02x faster) 136.47 µs (❌ 1.99x slower)

Summary

  • Biggest gain: sandboxes/create_uninitialized/medium — ✅ 1.42x faster
  • Worst regression: guest_calls/interrupt_latency — ❌ 4.96x slower

hyperv-ws2025 / intel (Windows) (❌ *6.68x slower* → ✅ **1.31x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
15.57 ms (✅ 1.06x slower) 12.67 ms (✅ 1.08x slower)

guest_calls

call call_with_host_function call_with_restore different_thread interrupt_latency
default 87.22 µs (❌ 1.17x slower) 123.51 µs (✅ 1.07x slower) 260.67 µs (✅ 1.08x slower)
large 82.18 µs (✅ 1.06x faster) 135.39 µs (❌ 1.16x slower) 790.72 µs (❌ 1.51x slower)
medium 82.22 µs (✅ 1.00x faster) 122.83 µs (✅ 1.07x faster) 351.26 µs (❌ 1.19x slower)
small 85.73 µs (✅ 1.08x slower) 123.63 µs (✅ 1.01x slower) 262.66 µs (✅ 1.07x slower)
101.76 µs (✅ 1.11x faster) 161.41 µs (❌ 2.44x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
5.71 s (❌ 1.22x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
105.93 µs (✅ 1.09x faster) 107.90 µs (✅ 1.08x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 8.05 ms (✅ 1.04x slower) 6.77 ms (❌ 1.16x slower) 1.34 ms (✅ 1.05x slower) 1.03 ms (✅ 1.09x faster)
large 398.19 ms (✅ 1.01x slower) 313.32 ms (✅ 1.04x faster) 399.59 ms (❌ 1.14x slower) 292.09 ms (✅ 1.05x faster)
medium 104.07 ms (✅ 1.00x slower) 81.05 ms (✅ 1.08x faster) 100.29 ms (❌ 1.15x slower) 75.12 ms (✅ 1.05x faster)
small 21.06 ms (✅ 1.01x slower) 16.71 ms (✅ 1.04x faster) 14.14 ms (❌ 1.16x slower) 12.65 ms (❌ 1.19x slower)

shared_memory

copy_from_slice copy_to_slice fill
1MB 69.43 µs (✅ 1.04x slower) 71.10 µs (✅ 1.04x slower) 35.20 µs (✅ 1.05x slower)
64MB 14.48 ms (✅ 1.01x slower) 19.65 ms (❌ 1.21x slower) 10.98 ms (✅ 1.10x slower)

snapshots

create restore
default 759.73 µs (✅ 1.31x faster) 151.57 µs (✅ 1.00x slower)
large 428.73 ms (✅ 1.11x faster) 76.93 ms (❌ 1.34x slower)
medium 110.64 ms (✅ 1.10x faster) 6.02 ms (❌ 6.68x slower)
small 14.38 ms (✅ 1.09x faster) 169.71 µs (❌ 1.19x slower)

Summary

  • Biggest gain: snapshots/create/default — ✅ 1.31x faster
  • Worst regression: snapshots/restore/medium — ❌ 6.68x slower

jprendes added 2 commits June 18, 2026 14:15
Introduce a new internal tooling crate (hyperlight-ci) that provides:

- bench subcommand: Runs criterion benchmarks in parallel via
  criterion-swarm. Features include:
  - Configurable parallelism (-j N, defaults to all P-cores)
  - Configurable output modes (spinner, stream, summary)
  - Support for pre-built binaries (--binary) to skip rebuilds
  - Trailing args forwarded to criterion (filter, --exact, etc.)

- bench-report subcommand: Generates markdown comparison tables from
  criterion's target/criterion/ JSON output via criterion-markdown.
  Features include:
  - Benchmark discovery via criterion-swarm
  - Optional allowlist filtering via --binary or trailing args
  - Output to stdout

This replaces ad-hoc benchmark scripting with a unified tool suitable
for both local development and CI report generation.

Signed-off-by: Jorge Prendes <jorge.prendes@gmail.com>
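
As a rough illustration of what one cell in such a comparison table encodes, the sketch below formats a new timing together with its ratio against a baseline. The names `format_time` and `format_cell`, the nanosecond inputs, and the unit thresholds are assumptions made for this example; they are not taken from criterion-markdown or hyperlight-ci. The status emoji are left out because the commit message does not say how they are assigned.

```rust
/// Render a duration given in nanoseconds with a human-friendly unit.
/// (Hypothetical helper; the unit cut-offs are an assumption.)
fn format_time(ns: f64) -> String {
    if ns >= 1e9 {
        format!("{:.2} s", ns / 1e9)
    } else if ns >= 1e6 {
        format!("{:.2} ms", ns / 1e6)
    } else if ns >= 1e3 {
        format!("{:.2} µs", ns / 1e3)
    } else {
        format!("{:.2} ns", ns)
    }
}

/// Render a table cell: the new time followed by how it compares to the baseline.
/// The ratio is always >= 1.0; the direction word says which way it goes.
fn format_cell(baseline_ns: f64, new_ns: f64) -> String {
    let (ratio, direction) = if new_ns > baseline_ns {
        (new_ns / baseline_ns, "slower")
    } else {
        (baseline_ns / new_ns, "faster")
    };
    format!("{} ({:.2}x {})", format_time(new_ns), ratio, direction)
}
```

For example, `format_cell(4_000_000.0, 8_000_000.0)` yields `8.00 ms (2.00x slower)`.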

- Add cargo alias (`cargo ci`) for convenient hyperlight-ci invocation
- Update dep_benchmarks workflow to use `cargo ci bench` and generate
  a markdown report via `cargo ci bench-report`, posting results as
  a PR comment per hypervisor/cpu matrix entry
- Add benchmarks job to ValidatePullRequest workflow with hypervisor
  and cpu matrix, gated behind docs-only and build-guests checks
- Grant pull-requests: write permission for PR comment posting
- Simplify Justfile bench recipes to delegate to `cargo ci bench`
- Update benchmarking docs to reflect the new workflow

Signed-off-by: Jorge Prendes <jorge.prendes@gmail.com>
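
A cargo alias of the kind described in this commit is usually declared in `.cargo/config.toml`. The fragment below is a minimal sketch of that pattern, assuming the alias simply runs the `hyperlight-ci` package; the definition actually added by this PR is not shown in the conversation and may differ.

```toml
# .cargo/config.toml (hypothetical; the alias added by this PR may differ)
[alias]
ci = "run --package hyperlight-ci --"
```

With an alias like this, `cargo ci bench -j 4` would expand to `cargo run --package hyperlight-ci -- bench -j 4`.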

@hyperlight-gh-bot

Benchmark Results

kvm / amd (Linux) (❌ *1.92x slower* → 🚀 **2.44x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
8.39 ms (❌ 1.92x slower) 8.17 ms (✅ 1.05x faster)

guest_calls

call_with_host_function call_with_restore call different_thread interrupt_latency
medium 23.61 µs (✅ 1.06x slower) 70.44 µs (✅ 1.14x faster) 12.76 µs (✅ 1.02x slower)
small 20.75 µs (✅ 1.10x faster) 65.94 µs (✅ 1.08x faster) 11.44 µs (✅ 1.10x faster)
default 24.01 µs (✅ 1.04x slower) 61.98 µs (✅ 1.01x faster) 12.75 µs (✅ 1.01x slower)
9.69 ms (❌ 1.46x slower) 25.97 µs (❌ 1.28x slower)
large 24.42 µs (✅ 1.06x slower) 89.09 µs (✅ 1.17x faster) 12.94 µs (✅ 1.03x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
598.13 ms (✅ 1.07x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
20.93 µs (🚀 1.81x faster) 22.34 µs (✅ 1.72x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 28.89 ms (🚀 2.25x faster) 27.47 ms (🚀 2.44x faster) 29.09 ms (🚀 2.27x faster) 38.79 ms (🚀 2.02x faster)
medium 8.75 ms (🚀 2.02x faster) 9.90 ms (🚀 1.96x faster) 9.02 ms (🚀 1.99x faster) 24.61 ms (✅ 1.25x faster)
small 2.28 ms (✅ 1.65x faster) 3.66 ms (✅ 1.36x faster) 2.32 ms (✅ 1.54x faster) 19.58 ms (❌ 1.25x slower)
default 403.02 µs (✅ 1.28x faster) 1.86 ms (✅ 1.03x slower) 445.02 µs (✅ 1.27x faster) 18.91 ms (❌ 1.63x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 4.70 ms (✅ 1.10x slower) 2.96 ms (❌ 1.18x slower) 5.50 ms (❌ 1.35x slower)
1MB 40.14 µs (✅ 1.02x slower) 28.00 µs (✅ 1.30x faster) 41.95 µs (❌ 1.11x slower)

snapshots

restore create
medium 15.99 µs (✅ 1.06x faster) 30.21 ms (✅ 1.16x faster)
large 36.55 µs (✅ 1.46x faster) 98.60 ms (✅ 1.42x faster)
small 11.40 µs (✅ 1.08x slower) 2.88 ms (✅ 1.22x faster)
default 9.43 µs (✅ 1.06x faster) 305.87 µs (✅ 1.29x faster)

Summary

  • Biggest gain: sandboxes/create_initialized/large — 🚀 2.44x faster
  • Worst regression: function_call_serialization/serialize_function_call — ❌ 1.92x slower

kvm / intel (Linux) (❌ *1.70x slower* → 🚀 **5.39x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
7.15 ms (❌ 1.25x slower) 9.96 ms (✅ 1.25x faster)

guest_calls

call_with_host_function call_with_restore call different_thread interrupt_latency
medium 38.21 µs (✅ 1.01x slower) 39.63 µs (✅ 1.04x faster) 19.73 µs (✅ 1.01x slower)
small 34.55 µs (✅ 1.07x faster) 35.37 µs (✅ 1.06x faster) 19.86 µs (✅ 1.07x slower)
default 38.21 µs (✅ 1.03x slower) 35.90 µs (✅ 1.06x faster) 19.70 µs (✅ 1.03x slower)
10.29 ms (❌ 1.55x slower) 15.21 µs (✅ 1.78x faster)
large 38.51 µs (✅ 1.09x slower) 75.44 µs (✅ 1.02x faster) 18.43 µs (✅ 1.05x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
638.15 ms (✅ 1.07x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
30.93 µs (🚀 3.40x faster) 31.88 µs (🚀 3.47x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 75.42 ms (🚀 2.03x faster) 78.01 ms (🚀 2.00x faster) 74.83 ms (🚀 2.06x faster) 91.67 ms (🚀 1.84x faster)
medium 19.55 ms (🚀 2.08x faster) 22.36 ms (🚀 1.92x faster) 19.71 ms (🚀 2.04x faster) 35.27 ms (✅ 1.56x faster)
small 2.21 ms (✅ 1.61x faster) 4.23 ms (✅ 1.31x faster) 2.41 ms (✅ 1.52x faster) 20.69 ms (❌ 1.20x slower)
default 376.93 µs (✅ 1.30x faster) 2.05 ms (✅ 1.02x faster) 407.56 µs (✅ 1.32x faster) 19.27 ms (❌ 1.70x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 6.08 ms (🚀 1.85x faster) 3.39 ms (✅ 1.10x slower) 4.56 ms (✅ 1.22x faster)
1MB 47.28 µs (✅ 1.00x faster) 27.83 µs (✅ 1.31x faster) 44.24 µs (✅ 1.08x slower)

snapshots

restore create
medium 15.12 µs (✅ 1.11x faster) 47.89 ms (✅ 1.56x faster)
large 66.00 µs (🚀 5.39x faster) 193.81 ms (✅ 1.53x faster)
small 11.61 µs (✅ 1.03x slower) 2.72 ms (✅ 1.46x faster)
default 10.68 µs (✅ 1.02x faster) 299.04 µs (✅ 1.30x faster)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 5.39x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ❌ 1.70x slower

mshv3 / amd (Linux) (❌ *1.98x slower* → 🚀 **2.60x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
13.03 ms (❌ 1.89x slower) 9.02 ms (❌ 1.71x slower)

guest_calls

call_with_host_function call_with_restore call different_thread interrupt_latency
default 68.32 µs (✅ 1.35x faster) 133.63 µs (✅ 1.22x faster) 39.51 µs (✅ 1.34x faster)
small 66.41 µs (✅ 1.37x faster) 150.98 µs (✅ 1.00x faster) 44.17 µs (✅ 1.25x faster)
large 70.43 µs (✅ 1.29x faster) 208.55 µs (✅ 1.12x faster) 40.19 µs (✅ 1.39x faster)
57.63 µs (✅ 1.34x faster) 44.31 µs (✅ 1.12x faster)
medium 70.11 µs (✅ 1.28x faster) 160.10 µs (✅ 1.04x faster) 42.22 µs (✅ 1.31x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
86.99 ms (❌ 1.16x slower)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
57.44 µs (✅ 1.49x faster) 61.58 µs (✅ 1.31x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
medium 106.92 ms (❌ 1.71x slower) 11.47 ms (🚀 2.27x faster) 11.05 ms (🚀 2.32x faster) 13.89 ms (🚀 2.04x faster)
small 82.80 ms (❌ 1.98x slower) 2.42 ms (✅ 1.68x faster) 2.17 ms (✅ 1.79x faster) 4.42 ms (✅ 1.31x faster)
large 198.04 ms (❌ 1.37x slower) 37.85 ms (🚀 2.56x faster) 36.37 ms (🚀 2.60x faster) 43.58 ms (🚀 2.36x faster)
default 49.51 ms (❌ 1.44x slower) 497.14 µs (✅ 1.29x faster) 459.41 µs (✅ 1.33x faster) 1.59 ms (✅ 1.04x slower)

shared_memory

copy_from_slice copy_to_slice fill
64MB 5.35 ms (✅ 1.05x faster) 5.42 ms (❌ 1.11x slower) 4.38 ms (❌ 1.17x slower)
1MB 42.31 µs (✅ 1.01x slower) 43.13 µs (✅ 1.00x faster) 41.46 µs (✅ 1.01x slower)

snapshots

restore create
default 82.39 µs (✅ 1.02x faster) 357.16 µs (✅ 1.43x faster)
large 1.62 ms (🚀 2.07x faster) 195.13 ms (✅ 1.23x faster)
medium 98.24 µs (✅ 1.18x faster) 48.58 ms (✅ 1.18x faster)
small 87.38 µs (✅ 1.00x slower) 2.90 ms (✅ 1.52x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized/large — 🚀 2.60x faster
  • Worst regression: sandboxes/create_initialized_and_drop/small — ❌ 1.98x slower

mshv3 / intel (Linux) (❌ *2.33x slower* → 🚀 **1.94x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
14.40 ms (✅ 1.04x slower) 11.87 ms (❌ 1.13x slower)

guest_calls

call call_with_restore call_with_host_function interrupt_latency different_thread
small 79.07 µs (✅ 1.34x faster) 243.84 µs (✅ 1.14x faster) 115.85 µs (✅ 1.43x faster)
large 70.91 µs (✅ 1.45x faster) 331.88 µs (✅ 1.24x faster) 106.94 µs (✅ 1.65x faster)
medium 77.63 µs (✅ 1.32x faster) 236.32 µs (✅ 1.21x faster) 113.70 µs (✅ 1.49x faster)
default 76.25 µs (✅ 1.39x faster) 240.87 µs (✅ 1.15x faster) 118.87 µs (✅ 1.40x faster)
61.04 µs (✅ 1.41x faster) 82.67 µs (✅ 1.54x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
114.29 ms (✅ 1.12x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
92.98 µs (✅ 1.54x faster) 94.16 µs (✅ 1.53x faster)

sandboxes

create_uninitialized create_uninitialized_and_drop create_initialized_and_drop create_initialized
medium 16.69 ms (✅ 1.69x faster) 17.05 ms (✅ 1.65x faster) 147.48 ms (❌ 2.33x slower) 20.09 ms (✅ 1.56x faster)
default 362.39 µs (✅ 1.36x faster) 371.26 µs (✅ 1.39x faster) 49.84 ms (❌ 1.50x slower) 1.46 ms (✅ 1.14x faster)
large 61.97 ms (✅ 1.73x faster) 63.88 ms (✅ 1.66x faster) 224.00 ms (❌ 1.44x slower) 69.87 ms (✅ 1.61x faster)
small 2.24 ms (🚀 1.94x faster) 2.90 ms (✅ 1.55x faster) 97.52 ms (❌ 2.31x slower) 5.08 ms (✅ 1.27x faster)

shared_memory

fill copy_from_slice copy_to_slice
64MB 7.40 ms (✅ 1.02x slower) 11.69 ms (✅ 1.03x slower) 8.95 ms (✅ 1.06x faster)
1MB 33.44 µs (✅ 1.14x faster) 63.75 µs (✅ 1.02x slower) 41.50 µs (✅ 1.07x slower)

snapshots

create restore
large 162.87 ms (✅ 1.29x faster) 1.74 ms (✅ 1.50x faster)
default 354.95 µs (✅ 1.28x faster) 153.93 µs (✅ 1.03x faster)
small 4.39 ms (✅ 1.36x faster) 158.39 µs (✅ 1.03x slower)
medium 41.74 ms (✅ 1.32x faster) 169.35 µs (✅ 1.04x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized/small — 🚀 1.94x faster
  • Worst regression: sandboxes/create_initialized_and_drop/medium — ❌ 2.33x slower

hyperv-ws2025 / amd (Windows) (❌ *8.66x slower* → ✅ **1.11x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
13.63 ms (❌ 1.39x slower) 12.67 ms (❌ 1.48x slower)

guest_calls

call call_with_host_function call_with_restore different_thread interrupt_latency
default 48.65 µs (❌ 1.27x slower) 78.79 µs (❌ 1.23x slower) 174.66 µs (❌ 1.26x slower)
large 53.10 µs (❌ 1.29x slower) 93.00 µs (❌ 1.43x slower) 461.88 µs (❌ 1.78x slower)
medium 51.24 µs (❌ 1.40x slower) 96.78 µs (❌ 1.44x slower) 229.54 µs (❌ 1.40x slower)
small 50.33 µs (❌ 1.33x slower) 78.57 µs (❌ 1.26x slower) 187.71 µs (❌ 1.30x slower)
78.47 µs (✅ 1.04x slower) 153.79 µs (❌ 5.12x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
4.74 s (❌ 1.44x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
98.65 µs (❌ 1.34x slower) 82.02 µs (❌ 1.17x slower)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 8.24 ms (❌ 1.64x slower) 6.29 ms (❌ 1.48x slower) 1.39 ms (❌ 1.24x slower) 1.50 ms (✅ 1.00x faster)
large 405.55 ms (❌ 1.21x slower) 320.54 ms (✅ 1.07x slower) 388.69 ms (✅ 1.04x slower) 311.94 ms (✅ 1.11x faster)
medium 104.65 ms (❌ 1.23x slower) 107.36 ms (❌ 1.42x slower) 95.10 ms (✅ 1.00x slower) 79.46 ms (✅ 1.10x faster)
small 20.52 ms (❌ 1.32x slower) 17.29 ms (❌ 1.28x slower) 13.25 ms (✅ 1.04x slower) 15.18 ms (❌ 1.27x slower)

shared_memory

copy_from_slice copy_to_slice fill
1MB 50.56 µs (❌ 1.20x slower) 51.72 µs (❌ 1.15x slower) 44.61 µs (✅ 1.09x slower)
64MB 18.59 ms (❌ 2.23x slower) 18.27 ms (❌ 1.93x slower) 12.48 ms (❌ 2.02x slower)

snapshots

create restore
default 872.32 µs (❌ 1.17x slower) 101.89 µs (❌ 1.40x slower)
large 446.46 ms (❌ 1.18x slower) 59.04 ms (❌ 1.51x slower)
medium 120.42 ms (❌ 1.35x slower) 2.29 ms (❌ 8.66x slower)
small 15.43 ms (❌ 1.36x slower) 115.82 µs (❌ 1.74x slower)

Summary

  • Biggest gain: sandboxes/create_uninitialized/large — ✅ 1.11x faster
  • Worst regression: snapshots/restore/medium — ❌ 8.66x slower

hyperv-ws2025 / intel (Windows) (❌ *3.32x slower* → ✅ **1.24x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
16.46 ms (❌ 1.12x slower) 13.95 ms (❌ 1.19x slower)

guest_calls

call call_with_host_function call_with_restore different_thread interrupt_latency
default 77.41 µs (✅ 1.04x slower) 130.89 µs (✅ 1.08x slower) 234.50 µs (✅ 1.05x faster)
large 89.57 µs (✅ 1.01x slower) 127.20 µs (❌ 1.19x slower) 784.62 µs (❌ 1.38x slower)
medium 79.85 µs (✅ 1.03x faster) 139.73 µs (✅ 1.05x slower) 353.80 µs (❌ 1.19x slower)
small 79.49 µs (✅ 1.02x faster) 134.30 µs (✅ 1.08x slower) 254.65 µs (✅ 1.06x slower)
100.42 µs (✅ 1.14x faster) 148.20 µs (❌ 2.24x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
4.71 s (✅ 1.01x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
97.65 µs (✅ 1.24x faster) 97.15 µs (✅ 1.23x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 7.45 ms (✅ 1.04x faster) 6.10 ms (✅ 1.04x slower) 1.60 ms (❌ 1.31x slower) 1.42 ms (❌ 1.28x slower)
large 383.32 ms (✅ 1.02x faster) 320.47 ms (✅ 1.02x faster) 408.89 ms (❌ 1.17x slower) 301.74 ms (✅ 1.02x faster)
medium 111.60 ms (✅ 1.07x slower) 84.30 ms (✅ 1.03x faster) 97.63 ms (❌ 1.12x slower) 73.95 ms (✅ 1.07x faster)
small 20.50 ms (✅ 1.01x faster) 17.24 ms (✅ 1.01x faster) 14.45 ms (❌ 1.18x slower) 10.59 ms (✅ 1.01x faster)

shared_memory

copy_from_slice copy_to_slice fill
1MB 70.83 µs (❌ 1.16x slower) 82.24 µs (❌ 1.16x slower) 37.47 µs (❌ 1.15x slower)
64MB 14.75 ms (✅ 1.03x slower) 17.81 ms (✅ 1.10x slower) 11.50 ms (❌ 1.15x slower)

snapshots

create restore
default 810.67 µs (✅ 1.24x faster) 134.78 µs (✅ 1.11x faster)
large 433.71 ms (✅ 1.10x faster) 83.49 ms (❌ 1.45x slower)
medium 112.00 ms (✅ 1.09x faster) 2.99 ms (❌ 3.32x slower)
small 14.84 ms (✅ 1.06x faster) 161.35 µs (❌ 1.11x slower)

Summary

  • Biggest gain: snapshots/create/default — ✅ 1.24x faster
  • Worst regression: snapshots/restore/medium — ❌ 3.32x slower
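
The "Biggest gain" and "Worst regression" lines in each Summary can be read as the two extremes of the per-benchmark speedup factor (baseline time divided by new time). The sketch below shows one way to pick them. The `Entry` type, the `summarize` function, and the sample data are hypothetical and only illustrate the selection; the bot's actual implementation is not shown in this conversation.

```rust
/// One benchmark result: its path plus baseline and new mean times in nanoseconds.
/// (Hypothetical type for illustration.)
struct Entry {
    name: &'static str,
    baseline_ns: f64,
    new_ns: f64,
}

impl Entry {
    /// Speedup factor: above 1.0 the new run is faster, below 1.0 it is slower.
    fn speedup(&self) -> f64 {
        self.baseline_ns / self.new_ns
    }
}

/// Return (biggest gain, worst regression), or None when there are no entries.
fn summarize(entries: &[Entry]) -> Option<(&Entry, &Entry)> {
    let best = entries
        .iter()
        .max_by(|a, b| a.speedup().total_cmp(&b.speedup()))?;
    let worst = entries
        .iter()
        .min_by(|a, b| a.speedup().total_cmp(&b.speedup()))?;
    Some((best, worst))
}

/// Made-up sample data; the numbers are not taken from the tables above.
fn demo_entries() -> Vec<Entry> {
    vec![
        Entry { name: "group/fast", baseline_ns: 200.0, new_ns: 100.0 },
        Entry { name: "group/slow", baseline_ns: 100.0, new_ns: 150.0 },
        Entry { name: "group/same", baseline_ns: 100.0, new_ns: 100.0 },
    ]
}
```

On the sample data, `summarize(&demo_entries())` selects `group/fast` as the biggest gain and `group/slow` as the worst regression. The same two extremes would also give the range shown next to each platform heading.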

hyperlight-dev deleted a comment from hyperlight-gh-bot (bot) Jun 18, 2026